Connecting with Apify
The goal of this document
This document explains how to connect the vendor Apify to Airbyte, which then delivers the data to a MySQL destination. Users can choose a different destination as required, for example MySQL or PostgreSQL.
Useful Resources of Airbyte
- Official Website: https://airbyte.com/
- List of sources available in Airbyte: https://docs.airbyte.com/integrations/sources
- Help Center: https://discuss.airbyte.io/new
- How to install Airbyte: https://docs.airbyte.com/deploying-airbyte/local-deployment
Requirements for the Apify source
- The user should have an account on https://console.apify.com/sign-in. If not, sign up using this link: https://console.apify.com/sign-up
- The user should have an Apify Personal Access Token.
Steps to Fetch the Dataset ID from apify.com
Here, we fetch the Dataset ID, rather than an API key, to make the connection between the vendor and Airbyte.
- To fetch the Dataset ID from Apify, go to https://console.apify.com/sign-in
- Log in with your credentials.
- Click the Storage option in the left menu bar, as shown in the figure below:
- On the Storage page, click the Datasets button to open the datasets section of apify.com, then note the ID of the required dataset.
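The Dataset ID can also be looked up programmatically instead of through the console. The following is a minimal sketch, assuming the Apify API v2 dataset-list endpoint `https://api.apify.com/v2/datasets`, a Personal Access Token passed as a Bearer token, and a response shaped like `{"data": {"items": [{"id": ..., "name": ...}]}}`. The helper names and the sample payload are illustrative and not part of this guide.

```python
import json
import urllib.request

# Assumption: Apify API v2 endpoint that lists the datasets of the token owner.
APIFY_DATASETS_URL = "https://api.apify.com/v2/datasets"


def extract_dataset_ids(payload: dict) -> list:
    """Return (id, name) pairs from a dataset-list response body."""
    items = payload.get("data", {}).get("items", [])
    # Unnamed datasets may carry no name, so fall back to an empty string.
    return [(item["id"], item.get("name") or "") for item in items]


def list_datasets(token: str) -> list:
    """Call the (assumed) endpoint and return (id, name) pairs."""
    request = urllib.request.Request(
        APIFY_DATASETS_URL,
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return extract_dataset_ids(json.load(response))


# Offline example with a made-up payload; no network call is made here.
sample = {
    "data": {
        "items": [
            {"id": "abc123", "name": "my-dataset"},
            {"id": "def456", "name": None},
        ]
    }
}
for dataset_id, name in extract_dataset_ids(sample):
    print(dataset_id, name)
```

To run it against a real account, call `list_datasets("<your-personal-access-token>")` and copy the ID of the dataset you want to sync.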
Steps to Connect apify.com with AIV
Go to Airbyte using the link below: https://airbyte.com/
The Airbyte landing screen looks like this:
- There are three options on the left: Connections, Sources, and Destinations. To continue, click the Sources option.
- To add the Apify source, click the New Source button in the top-right corner.
- To add any source, the user must fill in the properties that Airbyte requires; each source has its own set of properties.
Note: Each source requires different properties, although Client ID, Client Secret, and Refresh Token are common to most sources.
- Set up source form overview:
- Name: A name for the source, chosen by the user.
- Source Type: The source type, selected from the list provided by Airbyte.
- Dataset ID: The Dataset ID fetched from apify.com.
- Now, set up the source:
- Enter Apify as the name,
- Select Apify Dataset as the source type,
- Enter the Dataset ID in the form,
- Validate the form, as shown in the figure below:
- Now click Set up Connection. Airbyte tests the connection and then redirects you to add a destination for the source, as shown in the figure below:
Test connection successful alert.
- After the connection test succeeds, add the destination.
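The source form described above can be sketched as a small data structure with a completeness check before submitting it. This is only an illustration: the key names (`name`, `source_type`, `dataset_id`) and the `missing_fields` helper are hypothetical and do not represent Airbyte's actual configuration schema.

```python
# Hypothetical field names mirroring the three form inputs in this guide.
REQUIRED_FIELDS = ("name", "source_type", "dataset_id")


def missing_fields(form: dict) -> list:
    """Return the required fields that are absent or blank in the form."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = form.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Values taken from the steps above; the Dataset ID is a placeholder.
source_form = {
    "name": "Apify",
    "source_type": "Apify Dataset",
    "dataset_id": "<your-dataset-id>",
}

problems = missing_fields(source_form)
print("Form is complete" if not problems else f"Missing: {problems}")
```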
On the source screen:
- The top menu bar shows two buttons: 1. Overview of the source and 2. Settings of the source.
- Overview: Shows the details related to the source; it may be empty, as in the screen above.
- Settings: Lets the user edit the form details of the added source.
- The Add destination button leads from this source page to the destination page.
- Click Destination in the top-right corner, as shown in the figure below:
This shows the available destinations; the user can also add a new destination as required.
To learn more about adding the required destination in Airbyte, go to the Destination page.
- Click MySQL Training here. It may take some time to fetch the stream names, as shown in the figure below:
- After all the streams are loaded from the vendor to the destination, the destination page looks as shown in the figure below:
- After all the data streams are loaded, the user needs to set the Sync Frequency and Table Prefix, as shown in the figure below:
- Sync Frequency: Manual
- Table Prefix: Apify_
- As shown in the figure below:
- When the user adds the prefix, it is automatically prepended to the destination stream name, as shown in the figure below:
- At the bottom of the destination page, two radio buttons control the Normalization option. Keep Basic Normalization selected.
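The effect of the Table Prefix setting above can be illustrated with a short sketch. It assumes the prefix is simply prepended to each stream name, as the guide describes; the stream name `DatasetItems` and the helper function are hypothetical examples, not names taken from Airbyte.

```python
def destination_stream_names(prefix: str, streams: list) -> dict:
    """Map each source stream name to its prefixed destination name."""
    return {stream: f"{prefix}{stream}" for stream in streams}


# "Apify_" is the prefix used in this guide; the stream name is made up.
mapping = destination_stream_names("Apify_", ["DatasetItems"])
for source_name, destination_name in mapping.items():
    print(f"{source_name} -> {destination_name}")
```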
Now click the Set up connection button to complete the connection. Airbyte tests the connection again, which may take some time.
To validate that Airbyte has synced the data, go to Connections in the left menu bar:
- Find the added Apify connection in the list,
The data isn't synced yet, so a not-synced icon appears at the start of the row. Click the row to see the details.
- The Apify connection page shows the status of the data sync and the sync history:
- The title, showing the source and destination names.
- The Status and Settings pages in the top menu bar.
- Enable button: lets the user enable or disable syncing from the source to the destination.
- Reset your data and Sync buttons: the user can reset their data and sync only the updated data from the vendor.
- Click the Sync button to sync the data manually. This starts the sync process, and its status appears in the history grid.
Once the sync completes, the history shows it as Succeeded, along with the data size, the number of records, and the time of the sync. When the sync frequency is set to every hour, Airbyte syncs the data automatically each hour, as shown in the figure below:
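Besides the Airbyte history grid, the result of a sync can also be checked on the MySQL side by listing the tables that carry the configured prefix. The sketch below is an assumption-laden illustration: it expects a DB-API 2.0 connection from a MySQL driver that uses `%s` placeholders (for example mysql-connector-python or PyMySQL), and the schema name passed in is hypothetical. The function and constant names are not part of Airbyte or this guide.

```python
# Query against MySQL's information_schema; %s placeholders are the
# parameter style used by common MySQL drivers (an assumption here).
PREFIXED_TABLES_SQL = (
    "SELECT table_name FROM information_schema.tables "
    "WHERE table_schema = %s AND table_name LIKE %s"
)


def like_pattern_for_prefix(prefix: str) -> str:
    """Escape LIKE wildcards in the prefix and append a trailing %."""
    escaped = (
        prefix.replace("\\", "\\\\")
        .replace("%", "\\%")
        .replace("_", "\\_")  # "_" matches any single character in LIKE
    )
    return escaped + "%"


def list_prefixed_tables(connection, schema: str, prefix: str) -> list:
    """Return the table names in `schema` that start with `prefix`."""
    cursor = connection.cursor()
    try:
        cursor.execute(PREFIXED_TABLES_SQL, (schema, like_pattern_for_prefix(prefix)))
        return [row[0] for row in cursor.fetchall()]
    finally:
        cursor.close()


# The pattern for the prefix used in this guide.
pattern = like_pattern_for_prefix("Apify_")
print(pattern)
```

With a live connection, `list_prefixed_tables(connection, "<your-schema>", "Apify_")` should return the destination tables created by the sync, if any exist.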